64 research outputs found

    The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing

    We describe a statistical framework for QTL mapping using bulk segregant analysis (BSA) based on high-throughput, short-read sequencing. Our proposed approach is based on a smoothed version of the standard G statistic, and takes into account variation in allele frequency estimates due to sampling of segregants to form bulks as well as variation introduced during the sequencing of bulks. Using simulation, we explore the impact of key experimental variables such as bulk size and sequencing coverage on the ability to detect QTLs. Counterintuitively, we find that relatively large bulks maximize the power to detect QTLs even though this implies weaker selection and less extreme allele frequency differences. Our simulation studies suggest that with large bulks and sufficient sequencing depth, the methods we propose can be used to detect even weak-effect QTLs, and we demonstrate the utility of this framework by application to a BSA experiment in the budding yeast Saccharomyces cerevisiae.
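
To make the idea of a smoothed per-site statistic concrete, here is a minimal sketch. It assumes the statistic is a G test on the 2x2 table of allele counts from the two bulks at each SNP, and it assumes a tricube kernel for smoothing along the chromosome; the function names, the kernel choice, and the window parameter are illustrative and are not taken from the abstract.

```python
import math

def g_statistic(ref_hi, alt_hi, ref_lo, alt_lo):
    """G statistic for a 2x2 table of allele read counts (two bulks x two alleles)."""
    observed = [[ref_hi, alt_hi], [ref_lo, alt_lo]]
    total = ref_hi + alt_hi + ref_lo + alt_lo
    row = [ref_hi + alt_hi, ref_lo + alt_lo]
    col = [ref_hi + ref_lo, alt_hi + alt_lo]
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row[i] * col[j] / total
                g += o * math.log(o / expected)
    return 2.0 * g

def tricube_smooth(positions, values, window):
    """Kernel-weighted average of values at each position (hypothetical smoother).

    Neighbours farther than `window` base pairs away get zero weight.
    """
    smoothed = []
    for p in positions:
        num = 0.0
        den = 0.0
        for q, v in zip(positions, values):
            d = abs(q - p) / window
            if d < 1.0:
                w = (1.0 - d ** 3) ** 3
                num += w * v
                den += w
        smoothed.append(num / den)  # den >= 1 because the site itself has weight 1
    return smoothed
```

A site where both bulks have identical allele counts yields G = 0, and larger values indicate a stronger allele frequency difference between the bulks.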

    HAMSTER: visualizing microarray experiments as a set of minimum spanning trees

    Background: Visualization tools allow researchers to obtain a global view of the interrelationships between the probes or experiments of a gene expression (e.g. microarray) data set. Some existing methods include hierarchical clustering and k-means. In recent years, others have proposed applying minimum spanning trees (MST) for microarray clustering. Although MST-based clustering is formally equivalent to the dendrograms produced by hierarchical clustering under certain conditions, visually they can be quite different. Methods: HAMSTER (Helpful Abstraction using Minimum Spanning Trees for Expression Relations) is an open source system for generating a set of MSTs from the experiments of a microarray data set. While previous works have generated a single MST from a data set for data clustering, we recursively merge experiments and repeat this process to obtain a set of MSTs for data visualization. Depending on the parameters chosen, each tree is analogous to a snapshot of one step of the hierarchical clustering process. We scored and ranked these trees using one of three proposed schemes. HAMSTER is implemented in C++ and makes use of Graphviz for laying out each MST. Results: We report on the running time of HAMSTER and demonstrate, using data sets from the NCBI Gene Expression Omnibus (GEO), that the images created by HAMSTER offer insights that differ from the dendrograms of hierarchical clustering. In addition to the C++ program, which is available as open source, we also provide a web-based version (HAMSTER+) which allows users to apply our system through a web browser without any computer programming knowledge. Conclusion: Researchers may find it helpful to include HAMSTER in their microarray analysis workflow as it can offer insights that differ from hierarchical clustering. We believe that HAMSTER would be useful for certain types of gradient data sets (e.g. time-series data) and data that indicate relationships between cells/tissues. Both the source and the web server variant of HAMSTER are available from http://hamster.cbrc.jp/.
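
As a rough illustration of building a single MST over the experiments of an expression data set, the sketch below uses Prim's algorithm with Euclidean distance between expression profiles. This is not HAMSTER's code (which is C++): the distance measure, the function names, and the data layout are assumptions made here, and the recursive merging step that produces a set of trees is not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minimum_spanning_tree(profiles):
    """Prim's algorithm over the complete graph of experiments.

    `profiles` is a list of numeric vectors, one per experiment.
    Returns a list of edges (i, j, distance) forming the MST.
    """
    n = len(profiles)
    if n == 0:
        return []
    # best[j] = (cheapest known distance from the tree to j, tree node giving it)
    best = {j: (euclidean(profiles[0], profiles[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        dist, i = best.pop(j)
        edges.append((i, j, dist))
        # The newly added node j may offer a shorter link to the remaining nodes.
        for k in best:
            d = euclidean(profiles[j], profiles[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges
```

For example, `minimum_spanning_tree([[0, 0], [1, 0], [5, 0]])` links experiment 0 to 1 and experiment 1 to 2, since the direct 0 to 2 distance is longer than going through 1.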

    Bayesian probabilistic network modeling from multiple independent replicates

    Often protein (or gene) time-course data are collected for multiple replicates. Each replicate generally has sparse data with the number of time points being less than the number of proteins. Usually each replicate is modeled separately. However, here all the information in each of the replicates is used to make a composite inference about signal networks. The composite inference comes from combining well-structured Bayesian probabilistic modeling with a multi-faceted Markov Chain Monte Carlo algorithm. Based on simulations which investigate many different types of network interactions and experimental variabilities, the composite examination uncovers many important relationships within the networks. In particular, when the edge's partial correlation between two proteins is at least moderate, then the composite's posterior probability is large.

    Human Gene Coexpression Landscape: Confident Network Derived from Tissue Transcriptomic Profiles

    This is an open-access article distributed under the terms of the Creative Commons Attribution License. [Background]: Analysis of gene expression data using genome-wide microarrays is a technique often used in genomic studies to find coexpression patterns and locate groups of co-transcribed genes. However, most studies done at a global "omic" scale are not focused on human samples, and when they do correspond to human samples they very often include heterogeneous datasets, mixing normal with disease-altered samples. Moreover, the technical noise present in genome-wide expression microarrays is another well-reported problem that is often not addressed with robust statistical methods, and the estimation of errors in the data is not provided. [Methodology/Principal Findings]: Human genome-wide expression data from a controlled set of normal-healthy tissues is used to build a confident human gene coexpression network avoiding both pathological and technical noise. To achieve this we describe a new method that combines several statistical and computational strategies: robust normalization and expression signal calculation; correlation coefficients obtained by parametric and non-parametric methods; random cross-validations; and estimation of the statistical accuracy and coverage of the data. All these methods provide a series of coexpression datasets where the level of error is measured and can be tuned. To define the errors, the rates of true positives are calculated by assignment to biological pathways. The results provide a confident human gene coexpression network that includes 3327 gene-nodes and 15841 coexpression-links, and a comparative analysis shows good improvement over previously published datasets. Further functional analysis of a subset core network, validated by two independent methods, shows coherent biological modules that share common transcription factors. The network reveals a map of coexpression clusters organized in well-defined functional constellations. Two major regions in this network correspond to genes involved in nuclear and mitochondrial metabolism, and investigations on their functional assignment indicate that more than 60% are house-keeping and essential genes. The network displays new non-described gene associations and it allows the placement in a functional context of some unknown non-assigned genes based on their interactions with known gene families. [Conclusions/Significance]: The identification of stable and reliable human gene-to-gene coexpression networks is essential to unravel the interactions and functional correlations between human genes at an omic scale. This work contributes to this aim, and we are making available for the scientific community the validated human gene coexpression networks obtained, to allow further analyses on the network or on some specific gene associations. The data are available free online at http://bioinfow.dep.usal.es/coexpression/. © 2008 Prieto et al. Funding and grant support was provided by the Ministry of Health, Spanish Government (ISCiii-FIS, MSyC; Project reference PI061153) and by the Ministry of Education, Castilla-Leon Local Government (JCyL; Project reference CSI03A06). Peer Reviewed
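
The step of combining parametric and non-parametric correlation coefficients can be sketched as follows. This is only an illustration under stated assumptions: Pearson and Spearman are assumed as the two coefficients, an edge is kept only when both exceed a threshold in absolute value, and the function names and threshold are hypothetical. The normalization, cross-validation, and error-estimation steps described in the abstract are not shown.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def coexpression_edges(expr, threshold):
    """Gene pairs whose Pearson and Spearman coefficients both pass the threshold.

    `expr` maps a gene name to its expression values across samples.
    """
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        rho = spearman(expr[g1], expr[g2])
        if min(abs(r), abs(rho)) >= threshold:
            edges.append((g1, g2))
    return edges
```

Requiring agreement between the two coefficients is one simple way to reduce the influence of outliers on a single measure; raising or lowering the threshold trades coverage against error, which is the kind of tuning the abstract refers to.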

    Environmental and Genetic Determinants of Colony Morphology in Yeast

    Nutrient stresses trigger a variety of developmental switches in the budding yeast Saccharomyces cerevisiae. One of the least understood of such responses is the development of complex colony morphology, characterized by intricate, organized, and strain-specific patterns of colony growth and architecture. The genetic bases of this phenotype and the key environmental signals involved in its induction have heretofore remained poorly understood. By surveying multiple strain backgrounds and a large number of growth conditions, we show that limitation for fermentable carbon sources coupled with a rich nitrogen source is the primary trigger for the colony morphology response in budding yeast. Using knockout mutants and transposon-mediated mutagenesis, we demonstrate that two key signaling networks regulating this response are the filamentous growth MAP kinase cascade and the Ras-cAMP-PKA pathway. We further show synergistic epistasis between Rim15, a kinase involved in integration of nutrient signals, and other genes in these pathways. Ploidy, mating type, and genotype-by-environment interactions also appear to play a role in controlling colony morphology. Our study highlights the high degree of network reuse in this model eukaryote; yeast use the same core signaling pathways in multiple contexts to integrate information about environmental and physiological states and generate diverse developmental outputs.

    Phenotypic Landscape of Saccharomyces cerevisiae during Wine Fermentation: Evidence for Origin-Dependent Metabolic Traits

    The species Saccharomyces cerevisiae includes natural strains, clinical isolates, and a large number of strains used in human activities. The aim of this work was to investigate how the adaptation to a broad range of ecological niches may have selectively shaped the yeast metabolic network to generate specific phenotypes. Using 72 S. cerevisiae strains collected from various sources, we provide, for the first time, a population-scale picture of the fermentative metabolic traits found in the S. cerevisiae species under wine-making conditions. Considerable phenotypic variation was found, suggesting that this yeast employs diverse metabolic strategies to face environmental constraints. Several groups of strains can be distinguished from the entire population on the basis of specific traits. Strains accustomed to growing in the presence of high sugar concentrations, such as wine yeasts and strains obtained from fruits, were able to achieve fermentation, whereas natural yeasts isolated from “poor-sugar” environments, such as oak trees or plants, were not. Commercial wine yeasts clearly appeared as a subset of vineyard isolates, and were mainly differentiated by their fermentative performances as well as their low acetate production. Overall, the emergence of the origin-dependent properties of the strains provides evidence for a phenotypic evolution driven by environmental constraints and/or human selection within S. cerevisiae.

    Systematic identification of functional modules and cis-regulatory elements in Arabidopsis thaliana

    Background: Several large-scale gene co-expression networks have been constructed successfully for predicting gene functional modules and cis-regulatory elements in Arabidopsis (Arabidopsis thaliana). However, these networks are usually constructed and analyzed in an ad hoc manner. In this study, we propose a completely parameter-free and systematic method for constructing gene co-expression networks and predicting functional modules as well as cis-regulatory elements. Results: Our novel method consists of an automated network construction algorithm, a parameter-free procedure to predict functional modules, and a strategy for finding known cis-regulatory elements that is suitable for consensus scanning without prior knowledge of the allowed extent of degeneracy of the motif. We apply the method to study a large collection of gene expression microarray data in Arabidopsis. We estimate that our co-expression network has ~94% accuracy and has topological properties similar to other biological networks, such as being scale-free and having a high clustering coefficient. Remarkably, among the ~300 predicted modules whose sizes are at least 20, 88% have at least one significantly enriched function, including a few extremely significant ones (ribosome, p < 1E-300; photosynthetic membrane, p < 1.3E-137; proteasome complex, p < 5.9E-126). In addition, we are able to predict cis-regulatory elements for 66.7% of the modules, and the association between the enriched cis-regulatory elements and the enriched functional terms can often be confirmed by the literature. Overall, our results are much more significant than those reported by several previous studies on similar data sets. Finally, we utilize the co-expression network to dissect the promoters of 19 Arabidopsis genes involved in the metabolism and signaling of the important plant hormone gibberellin, and achieve promising results that reveal interesting insight into the biosynthesis and signaling of gibberellin. Conclusions: The results show that our method is highly effective in finding functional modules from real microarray data. Our application on Arabidopsis leads to the discovery of the largest number of annotated Arabidopsis functional modules in the literature. Given the high statistical significance of functional enrichment and the agreement between cis-regulatory and functional annotations, we believe our Arabidopsis gene modules can be used to predict the functions of unknown genes in Arabidopsis, and to understand the regulatory mechanisms of many genes.

    RNA Methylation by the MIS Complex Regulates a Cell Fate Decision in Yeast

    For the yeast Saccharomyces cerevisiae, nutrient limitation is a key developmental signal causing diploid cells to switch from yeast-form budding to either foraging pseudohyphal (PH) growth or meiosis and sporulation. Prolonged starvation leads to lineage restriction, such that cells exiting meiotic prophase are committed to complete sporulation even if nutrients are restored. Here, we have identified an earlier commitment point in the starvation program. After this point, cells returned to nutrient-rich medium entered a form of synchronous PH development that was morphologically and genetically indistinguishable from starvation-induced PH growth. We show that lineage restriction during this time was, in part, dependent on the mRNA methyltransferase activity of Ime4, which played separable roles in meiotic induction and suppression of the PH program. Normal levels of meiotic mRNA methylation required the catalytic domain of Ime4, as well as two meiotic proteins, Mum2 and Slz1, which interacted and co-immunoprecipitated with Ime4. This MIS complex (Mum2, Ime4, and Slz1) functioned in both starvation pathways. Together, our results support the notion that the yeast starvation response is an extended process that progressively restricts cell fate and reveal a broad role of post-transcriptional RNA methylation in these decisions.

    Quantitative Genetics, Pleiotropy, and Morphological Integration in the Dentition of Papio hamadryas

    Variation in the mammalian dentition is highly informative of adaptations and evolutionary relationships, and consequently has been the focus of considerable research. Much of the current research exploring the genetic underpinnings of dental variation can trace its roots to Olson and Miller's 1958 book Morphological Integration. These authors explored patterns of correlation in the post-canine dentitions of the owl monkey and Hyopsodus, an extinct condylarth from the Eocene. Their results were difficult to interpret, as the authors themselves noted, due to a lack of genetic information through which to view the patterns of correlation. Following in the spirit of Olson and Miller's research, we present a quantitative genetic analysis of dental variation in a pedigreed population of baboons. We identify patterns of genetic correlations that provide insight into the genetic architecture of the baboon dentition. This genetic architecture indicates the presence of at least three modules: an incisor module that is genetically independent of the post-canine dentition, and a premolar module that demonstrates incomplete pleiotropy with the molar module. We then compare this matrix of genetic correlations to matrices of phenotypic correlations between the same measurements made on museum specimens of another baboon subspecies and the Southeast Asian colobine Presbytis. We observe moderate significant correlations between the matrices from these three primate taxa. From these observations we infer similarity in modularity and hypothesize a common pattern of genetic integration across the dental arcade in the Cercopithecoidea.